Add "silentpayments" module implementing BIP352 (take 4, limited to full-node scanning) #1765
Conversation
Added the optimized version on top of this PR: … For more context: …
Small supplementary update: I've created a corresponding Python implementation of the provided API functions based on secp256k1lab (https://github.com/theStack/secp256k1lab/blob/add_bip352_module_review_helper/src/secp256k1lab/bip352.py) (also linked in the PR description). The hope is that this makes reviewing this PR a bit easier by having a less noisy, "executable pseudo-code"-like description of what happens under the hood. The code passes the BIP352 test vectors and hence should be correct.
Thanks for rebasing on top of this PR, much appreciated! I will take a closer look in the next few days.
w0xlt
left a comment
Nit: Not related to optimization, but the diff below removes some redundant public-key serialization code:
diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 106da20..922433d 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -21,6 +21,19 @@
/** magic bytes for ensuring prevouts_summary objects were initialized correctly. */
static const unsigned char secp256k1_silentpayments_prevouts_summary_magic[4] = { 0xa7, 0x1c, 0xd3, 0x5e };
+/* Serialize a ge to compressed 33 bytes. Keeps eckey_pubkey_serialize usage uniform
+ * (expects non-const ge*), and centralizes the VERIFY_CHECK. */
+static SECP256K1_INLINE void secp256k1_sp_ge_serialize33(const secp256k1_ge* in, unsigned char out33[33]) {
+ size_t len = 33;
+ secp256k1_ge tmp = *in;
+ int ok = secp256k1_eckey_pubkey_serialize(&tmp, out33, &len, 1);
+#ifdef VERIFY
+ VERIFY_CHECK(ok && len == 33);
+#else
+ (void)ok;
+#endif
+}
+
/** Sort an array of silent payment recipients. This is used to group recipients by scan pubkey to
* ensure the correct values of k are used when creating multiple outputs for a recipient.
*
@@ -68,13 +81,11 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
secp256k1_sha256 hash;
unsigned char pubkey_sum_ser[33];
unsigned char input_hash[32];
- size_t len;
int ret, overflow;
secp256k1_silentpayments_sha256_init_inputs(&hash);
secp256k1_sha256_write(&hash, outpoint_smallest36, 36);
- ret = secp256k1_eckey_pubkey_serialize(pubkey_sum, pubkey_sum_ser, &len, 1);
- VERIFY_CHECK(ret && len == sizeof(pubkey_sum_ser));
+ secp256k1_sp_ge_serialize33(pubkey_sum, pubkey_sum_ser);
secp256k1_sha256_write(&hash, pubkey_sum_ser, sizeof(pubkey_sum_ser));
secp256k1_sha256_finalize(&hash, input_hash);
/* Convert input_hash to a scalar.
@@ -85,15 +96,13 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
* an error to ensure strict compliance with BIP0352.
*/
secp256k1_scalar_set_b32(input_hash_scalar, input_hash, &overflow);
- ret &= !secp256k1_scalar_is_zero(input_hash_scalar);
+ ret = !secp256k1_scalar_is_zero(input_hash_scalar);
return ret & !overflow;
}
static void secp256k1_silentpayments_create_shared_secret(const secp256k1_context *ctx, unsigned char *shared_secret33, const secp256k1_ge *public_component, const secp256k1_scalar *secret_component) {
secp256k1_gej ss_j;
secp256k1_ge ss;
- size_t len;
- int ret;
secp256k1_ecmult_const(&ss_j, public_component, secret_component);
secp256k1_ge_set_gej(&ss, &ss_j);
@@ -103,12 +112,7 @@ static void secp256k1_silentpayments_create_shared_secret(const secp256k1_contex
* impossible at this point considering we have already validated the public key and
* the secret key.
*/
- ret = secp256k1_eckey_pubkey_serialize(&ss, shared_secret33, &len, 1);
-#ifdef VERIFY
- VERIFY_CHECK(ret && len == 33);
-#else
- (void)ret;
-#endif
+ secp256k1_sp_ge_serialize33(&ss, shared_secret33);
/* Leaking these values would break indistinguishability of the transaction, so clear them. */
secp256k1_ge_clear(&ss);
@@ -585,7 +589,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
secp256k1_ge output_negated_ge, tx_output_ge;
secp256k1_gej tx_output_gej, label_gej;
unsigned char label33[33];
- size_t len;
secp256k1_xonly_pubkey_load(ctx, &tx_output_ge, tx_outputs[j]);
secp256k1_gej_set_ge(&tx_output_gej, &tx_output_ge);
@@ -595,7 +598,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
secp256k1_ge_neg(&output_negated_ge, &output_ge);
secp256k1_gej_add_ge_var(&label_gej, &tx_output_gej, &output_negated_ge, NULL);
secp256k1_ge_set_gej_var(&label_ge, &label_gej);
- ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
/* Serialize must succeed because the point was just loaded.
*
* Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -603,7 +605,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
* Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
* point at infinity.
*/
- VERIFY_CHECK(ret && len == 33);
+ secp256k1_sp_ge_serialize33(&label_ge, label33);
label_tweak = label_lookup(label33, label_context);
if (label_tweak != NULL) {
found = 1;
@@ -617,7 +619,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
secp256k1_gej_neg(&label_gej, &tx_output_gej);
secp256k1_gej_add_ge_var(&label_gej, &label_gej, &output_negated_ge, NULL);
secp256k1_ge_set_gej_var(&label_ge, &label_gej);
- ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
/* Serialize must succeed because the point was just loaded.
*
* Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -625,7 +626,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
* Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
* point at infinity.
*/
- VERIFY_CHECK(ret && len == 33);
+ secp256k1_sp_ge_serialize33(&label_ge, label33);
label_tweak = label_lookup(label33, label_context);
if (label_tweak != NULL) {
found = 1;
w0xlt
left a comment
nit: The following diff removes the implicit cast and clarifies that k is 4 bytes
diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 922433d..d94aed6 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -512,7 +512,8 @@ int secp256k1_silentpayments_recipient_scan_outputs(
secp256k1_xonly_pubkey output_xonly;
unsigned char shared_secret[33];
const unsigned char *label_tweak = NULL;
- size_t j, k, found_idx;
+ size_t j, found_idx;
+ uint32_t k;
int found, combined, valid_scan_key, ret;
/* Sanity check inputs */
jonasnick
left a comment
Thanks @theStack for the new PR. I can confirm that this PR is a rebased version of #1698, with the light client functionality removed and comments addressed, except for:
- #1698 (comment)
- #1698 (comment)
- #1698 (review) (only the last one, "elemement")
Is there a reason for serializing prevouts_summary without light client functionality? If not, I think the "don't do this" comment is sufficient. Right now, in contrast to the docs of all other opaque objects, this is missing, however: …
Force-pushed from c11d30c to 445f2e8
@w0xlt, @jonasnick: Thanks for the reviews! I've addressed the suggested changes:
Given that this compressed-pubkey-serialization pattern also shows up repeatedly in other modules (ellswift, musig), I think it would make the most sense to add a general helper (e.g. in eckey{,_impl}.h), which could be done in an independent PR. I've opened issue #1773 to see if there is conceptual support for doing this.
Good point, I can't think of a good reason for full nodes wanting to serialize prevouts_summary.
Force-pushed from 445f2e8 to 9103229
To address the open questions, I've reviewed the proposed changes by @w0xlt on 8d16914. I'm going to focus more on the key aspects I extracted from the review and the merits of each change, rather than on the big-O improvement claims, because I didn't get that far. These are multiple different changes rather than a single one, so to make the review easier I suggest breaking it up into multiple commits. I would state for each of them the purpose and the real-world scenario where the change would be relevant. Also, I would use clearer names for the variables or at least document their purpose. The changes I've identified so far are the following:
In general I agree with @jonasnick that we should define a clear target to benchmark and improve. As I've said before, the base case should be a wallet with a single label for change. In conclusion, from the proposed commit and the discussion around it, the only changes I've found clear enough to consider are:
Thanks @nymius for reviewing the changes, addressing the main points, and proposing a simplification. I'm currently splitting the optimization commit into smaller pieces to make it easier to review. The only part of the discussion that still feels a bit ambiguous is the "base" or "usual" case. … So the goal of this optimization would be to mitigate that scenario, not the collaborative one.
I ran the … Without the … Answering the questions: the optimized receiver implementation relies on a heuristic (the …)
@w0xlt, @nymius: Thanks for investigating this deeper. I've now also had a chance to look at the suggested optimizations and came to similar conclusions as stated in #1765 (comment). I particularly agree with the stated points that the changes should not increase complexity significantly and that the most important optimization candidate to consider for mitigating the worst-case scanning attack is "skip outputs that we have already found" (as previously stated by @jonasnick, see #1698 (comment) and jonasnick@311b4eb). I don't think stabilizing the sorting helps at all, since this is something that happens at the sender side, and we can't rely on the attacker using a specific implementation (even if they did, it's trivial for them to shuffle the outputs after creation). For the proposed target to benchmark, I'm proposing the following modified example that exhibits the worst-case scanning time based on a labels cache with one entry (for change outputs), by creating a tx with 23255 outputs [1] all targeted for Bob: 1df4287. Shower-thought from this morning: what if we treat the … Any thoughts on this? Maybe I'm still missing something. [1] that's an upper bound of maximum outputs per block: floor(1000000/43) = 23255
1df4287 is a good target.
I had to increase the stack size to be able to fit all the N_OUTPUTS-sized allocations in the example. Initially I preferred the …
@theStack Yes — if we want to keep only the adversarial-scenario optimizations, we can drop sort stabilization and the extra heads. I like your idea of avoiding dynamic memory allocation. That's a very interesting direction. On my machine, the scan completes in about 0.4s, which feels like a good balance between simplicity and the optimization needed for the labeled case. Below are the changes I had to make for your example to run on my machine and to record the scan time.
diff --git a/examples/silentpayments.c b/examples/silentpayments.c
index 5e71e73..d43332f 100644
--- a/examples/silentpayments.c
+++ b/examples/silentpayments.c
@@ -10,6 +10,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
+#include <time.h>
#include <secp256k1_extrakeys.h>
#include <secp256k1_silentpayments.h>
@@ -112,15 +113,21 @@ const unsigned char* label_lookup(
return NULL;
}
+static secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
+static const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
+static secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
+static secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
+static const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
+/* 2D array for holding multiple public key pairs. The second index, i.e., [2],
+ * is to represent the spend and scan public keys. */
+static unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
+
int main(void) {
unsigned char randomize[32];
unsigned char serialized_xonly[32];
- secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
- const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
- secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
- secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
- secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
- secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
secp256k1_silentpayments_prevouts_summary prevouts_summary;
secp256k1_pubkey unlabeled_spend_pubkey;
struct labels_cache bob_labels_cache;
@@ -209,11 +216,6 @@ int main(void) {
{
secp256k1_keypair sender_keypairs[N_INPUTS];
const secp256k1_keypair *sender_keypair_ptrs[N_INPUTS];
- secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
- const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
- /* 2D array for holding multiple public key pairs. The second index, i.e., [2],
- * is to represent the spend and scan public keys. */
- unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
unsigned char seckey[32];
printf("Sending...\n");
@@ -340,6 +342,9 @@ int main(void) {
* `secp256k1_silentpayments_recipient_prevouts_summary_create`
* 2. Call `secp256k1_silentpayments_recipient_scan_outputs`
*/
+ clock_t start, end;
+ double cpu_time_used;
+
ret = secp256k1_silentpayments_recipient_prevouts_summary_create(ctx,
&prevouts_summary,
smallest_outpoint,
@@ -356,14 +361,20 @@ int main(void) {
/* Scan the transaction */
n_found_outputs = 0;
+
+ start = clock();
ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
found_output_ptrs, &n_found_outputs,
- (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, N_OUTPUTS,
+ (const secp256k1_xonly_pubkey **)tx_output_ptrs, N_OUTPUTS,
bob_scan_key,
&prevouts_summary,
&unlabeled_spend_pubkey,
label_lookup, &bob_labels_cache /* NULL, NULL for no labels */
);
+ end = clock();
+ cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+ printf("Bob's scan took %f seconds\n", cpu_time_used);
+
if (!ret) {
printf("This transaction is not valid for Silent Payments, skipping.\n");
return EXIT_SUCCESS;
@@ -435,7 +446,7 @@ int main(void) {
n_found_outputs = 0;
ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
found_output_ptrs, &n_found_outputs,
- (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
+ (const secp256k1_xonly_pubkey **)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
carol_scan_key,
&prevouts_summary,
&unlabeled_spend_pubkey,
@nymius, @w0xlt: Thanks once again for the quick feedback and for benchmarking! Shortly after my previous comment, I've been notified about yet another approach to tackle the worst-case scanning time attack (kudos to @furszy for bringing up the idea!), that I think is even more elegant: we can use the pointers in the … The only tiny drawback about these non-malloc approaches might be that something that is conceptually an "in" parameter is modified, which might be a bit unsound in a strict API design sense. On the other hand, it shouldn't matter for the user (I doubt that these lists passed in would ever be reused for anything else afterwards by the callers), and we already do the same in the sending API for the recipients, so it's probably fine.
The way I see it currently, code paths for non-adversarial scenarios with increasing k values would be hit so rarely in practice that I'm sceptical it's worth putting much effort into those optimizations. When scanning, the vast majority of transactions won't have any matches in the first place. Out of those few that do have a match, the vast majority will very likely again not contain any repeated recipient (IMHO it doesn't make that much sense to do that, unless the recipient explicitly asks "I want to receive my payment split up in multiple UTXOs, but still in a single tx"?), so in the bigger picture those optimizations wouldn't matter all that much, and I'd assume that the dominant factor should be by far all the (unavoidable) ECDH computations per transaction. But that's still more of a guess and it's still good to already have optimization ideas at hand if we need them in the future.
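To make the "mark already-found outputs" idea concrete, here is a minimal, self-contained sketch (simplified placeholder types and a hypothetical helper name scan_pass, not the actual module code): once a transaction output has been claimed by some k, its entry in the caller-provided pointer array is set to NULL so that later k-iterations skip it.
#include <stddef.h>
#include <string.h>

typedef struct { unsigned char x[32]; } xonly_pk; /* stand-in for secp256k1_xonly_pubkey */

/* Try to match one candidate output P_k against the remaining tx outputs.
 * Returns 1 on a match; the matched entry is NULL-ed so future passes skip it. */
static int scan_pass(const xonly_pk *candidate, const xonly_pk **tx_outputs, size_t n_tx_outputs) {
    size_t i;
    for (i = 0; i < n_tx_outputs; i++) {
        if (tx_outputs[i] == NULL) continue; /* already claimed by an earlier k */
        if (memcmp(tx_outputs[i]->x, candidate->x, 32) == 0) {
            tx_outputs[i] = NULL; /* mark as found, shrinking the work of later passes */
            return 1;
        }
    }
    return 0;
}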
@theStack Thanks for continuing to refine the optimization. The deletion approach performs slightly better (0.40 s vs. 0.45 s), likely because deleting items shrinks the array and cuts the number of loop iterations by about 50% compared to nullifying them.
assuming the labels cache has only one entry (for change) for now; includes fixes by w0xlt (in order to avoid running into a stack overflow) and time-measuring code, see bitcoin-core#1765 (comment)
Force-pushed from 9103229 to 650b2fb
To summarize, the following table shows the proposed mitigations for the worst-case scanning attack so far, with benchmark results from my machine. The previous baseline commit with the worst-case example has been updated to include @w0xlt's changes, in order to work without stack size limit changes.
The run-times of the fixes vary slightly (the removal approach "fix2" being the fastest, confirming #1765 (comment) above), but are all in the same ballpark. I don't think exact performance results matter much here, as the goal of the mitigation should IMHO be to roughly cut the run-time down from "minutes" to "seconds" (and remember, this is already for the absolute worst case, one giant non-standard transaction filling out a whole block, and it can only slow down one specific SP recipient). Thus, I decided to pick the simplest approach that avoids dynamic memory allocation, i.e. fix number 3 using … With that tackled, I believe that all of the open questions and TODOs are addressed now (updated the PR description accordingly). The latest force-push also includes a rebase on master (to include the CI fix #1771).
Add a routine for the entire sending flow which takes a set of private keys,
the smallest outpoint, and a list of recipients, and returns a list of
x-only public keys by performing the following steps:
1. Sum up the private keys
2. Calculate the input_hash
3. For each recipient group:
3a. Calculate a shared secret
3b. Create the requested number of outputs
This function assumes a single sender context in that it requires the
sender to have access to all of the private keys. In the future, this
API may be expanded to allow for multiple senders or for a single
sender who does not have access to all private keys at any given time,
but for now these modes are considered out of scope / unsafe.
Internal to the library, add:
1. A function for creating shared secrets (i.e., a*B or b*A)
2. A function for generating the "SharedSecret" tagged hash
3. A function for creating a single output public key
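As a quick orientation for reviewers, here is a hedged usage sketch of this sending routine, modeled on the call made in the PR's tests (one plain input key, one recipient; error handling omitted). The wrapper name send_example and its parameters are illustrative only, not part of the proposed API:
#include <stddef.h>
#include <secp256k1.h>
#include <secp256k1_silentpayments.h>

/* One plain (non-taproot) input key, one recipient. The recipient's index field
 * is expected to equal its position in the array (0 here). */
static int send_example(const secp256k1_context *ctx,
                        const unsigned char alice_seckey[32],
                        const unsigned char smallest_outpoint[36],
                        const secp256k1_silentpayments_recipient *bob,
                        secp256k1_xonly_pubkey *generated_output) {
    secp256k1_xonly_pubkey *output_ptrs[1];
    const secp256k1_silentpayments_recipient *recipient_ptrs[1];
    const unsigned char *seckey_ptrs[1];

    output_ptrs[0] = generated_output;
    recipient_ptrs[0] = bob;
    seckey_ptrs[0] = alice_seckey;

    return secp256k1_silentpayments_sender_create_outputs(ctx,
        output_ptrs,
        recipient_ptrs, 1,
        smallest_outpoint,
        NULL, 0,        /* no taproot input keypairs */
        seckey_ptrs, 1);
}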
Add function for creating a label tweak. This requires a tagged hash function for labels. This function is used by the receiver for creating labels to be used for a) creating labeled addresses and b) populating a labels cache when scanning. Add function for creating a labeled spend pubkey. This involves taking a label tweak, turning it into a public key and adding it to the spend public key. This function is used by the receiver to create a labeled silent payment address. Add tests for the label API.
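For reviewers less familiar with BIP352 labels, the underlying math (paraphrased from the BIP, with b_scan the scan secret key, m the label integer, and B_spend the spend public key) is roughly:
    label_tweak_m = hash_BIP0352/Label(ser_256(b_scan) || ser_32(m))
    label_m       = label_tweak_m * G
    B_m           = B_spend + label_m   (the labeled spend public key)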
Add routine for scanning a transaction and returning the necessary spending data for any found outputs. This function works with labels via a lookup callback and requires access to the transaction outputs. Requiring access to the transaction outputs is not suitable for light clients, but light client support is enabled in the next commit. Add an opaque data type for passing around the prevout public key sum and the input hash tweak (input_hash). This data is passed to the scanner before the ECDH step as two separate elements so that the scanner can multiply the scan_key * input_hash before doing ECDH. Finally, add test coverage for the receiving API.
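In rough pseudo-code (paraphrasing BIP352; label handling and edge cases omitted), the scan performed by this routine looks like:
    ecdh_shared_secret = (input_hash * b_scan) * A_sum   (A_sum = sum of the input public keys)
    k = 0
    loop:
        t_k = hash_BIP0352/SharedSecret(ser_P(ecdh_shared_secret) || ser_32(k))
        P_k = B_spend + t_k * G
        if x(P_k) matches one of the tx outputs (directly or via a label):
            record the output and its tweak t_k, set k = k + 1, continue
        else:
            stop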
Demonstrate sending and scanning on full nodes.
Add a benchmark for a full transaction scan. Only benchmarks for scanning are added as this is the most performance critical portion of the protocol. Co-authored-by: Sebastian Falbesoner <[email protected]>
Add the BIP-352 test vectors. The vectors are generated with a Python script that converts the .json file from the BIP to C code: $ ./tools/tests_silentpayments_generate.py test_vectors.json > ./src/modules/silentpayments/vectors.h Co-authored-by: Ron <[email protected]> Co-authored-by: Sebastian Falbesoner <[email protected]> Co-authored-by: Tim Ruffing <[email protected]>
Co-authored-by: Jonas Nick <[email protected]> Co-authored-by: Sebastian Falbesoner <[email protected]>
Test midstate tags used in silent payments.
Force-pushed from 650b2fb to f96f41f
Yes, done. Using the new …
Thanks for the summary, I've checked the results. I agree with …
Hi @theStack . That's not actually the worst case. The worst case is when the attacker shuffles the outputs randomly (or in any adversarial order). I modified your benchmark to randomly shuffle outputs, and scanning takes several minutes to hours with the current optimization proposal. The issue is that the scanning function receives a label lookup callback, which makes it harder to build a sorted index upfront. Instead, for each output tweak index k, it linearly scans all transaction outputs — resulting in O(n²) complexity. An attacker creating ~23,000 unordered outputs can make scanning take hours. If we slightly change the API to replace the label lookup callback with a label entry set, we can sort once and search fast:
That way, scanning 23,255 adversarially-ordered outputs drops from hours to ~0.3s. The optimization is purely receiver-side, requires no protocol changes, and works regardless of output ordering. This reduces architectural flexibility, though.
Before: for each k, scan all n outputs → O(k × n) ≈ O(n²)
The following commit implements this proposal:
You can verify by running …
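To illustrate the shape of this proposal (a simplified, standalone C sketch using qsort/bsearch over serialized 32-byte x-only keys, not the proposed implementation itself):
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned char x[32]; } xonly_ser; /* serialized x-only output */

static int cmp32(const void *a, const void *b) {
    return memcmp(a, b, 32);
}

/* Sort the transaction outputs once: O(n log n). */
static void build_output_index(xonly_ser *outputs, size_t n) {
    qsort(outputs, n, sizeof(xonly_ser), cmp32);
}

/* Each candidate (e.g. a computed P_k or labeled variant) is then located in
 * O(log n) instead of a linear pass over all outputs. */
static const xonly_ser *find_output(const xonly_ser *candidate, const xonly_ser *outputs, size_t n) {
    return (const xonly_ser *)bsearch(candidate, outputs, n, sizeof(xonly_ser), cmp32);
}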
@w0xlt: Good catch, I indeed missed involving the output order for the worst-case scenario, so theStack@c16252d is not a useful target (in hindsight, the benchmark results look too good to be true, and my previous words of "we can't rely on the attacker using a specific implementation" strike back very hard). Thanks for the updated benchmark and the new worst-case fix proposal!
Are you sure it could take up to hours? That would be significantly worse than the previous baseline commit (~10 minutes on my machine), which seems implausible to me. In the unoptimized scanning implementation, the run-time should be very much independent of the order of outputs, as the total number of inner loop iterations is constant (…). I only did two runs so far with your shuffle-patch added, and got runtimes of ~10 minutes (unoptimized) and ~5 minutes (NULL-ify patch, "fix3"). Will do some more thorough benchmarks tomorrow, including your proposed solution.
@theStack Thanks for looking into this.
@w0xlt: I've looked at your proposed scanning implementation w0xlt@2305d73 and can confirm that it fixes the worst-case scanning attack, as the run-time of the modified example with shuffled outputs is reduced to <1s. What makes this different approach potentially interesting is that it could also speed up the common case (i.e. no matches), as the number of point additions needed for labels scanning seems to be significantly smaller for the small-scale user (where N_LABELS < 2 * N_OUTPUTS for the average transaction; if N_LABELS=1 and we assume the average transaction to scan for has N_OUTPUTS=2, your approach is 4x faster w.r.t. needed point additions). To verify that I understood it correctly and make the idea digestible to other interested reviewers, I've tried to summarize the current PR's scanning approach (as proposed in BIP-352) vs. your approach in the following gist, written in a somewhat hand-wavy way; I hope the pseudo-code is understandable: https://gist.github.com/theStack/25c77747838610931e8bbeb9d76faf78. Please let me know if I got that right; I think the approach leads to the same result, but I'm still trying to wrap my head around it. On the other hand, I'm still uncertain whether adding the additional code complexity and the involvement of dynamic memory allocation at all in a secp256k1 module is worth all this, considering that we are talking about an attack that costs an adversary money (at least the fees for the whole block space) and can only slow down one specific SP recipient. It seems a very unusual kind of attack. Changing the scanning API at this point also seems questionable, as it has already gone through several iterations and refinements over the years, and I think the "pass labels directly" approach is slower for users with a higher number of labels involved. Also, the mental load for reviewers is significantly higher if the approach doesn't follow the algorithm as specified in BIP-352, as they have to convince themselves that the result is equivalent. I'm hence debating whether one solution could be to keep the PR simply as-is and accept the fact that an adversary can slow down a single entity for a few minutes (for non-standard transactions), to a few seconds (for standard transactions, i.e. tx size <= 100kvB). I don't have a good final answer to this question myself, but want to discuss it at least and come to a decision that both maintainers and users are on board with before taking any further action. My (in retrospect somewhat naive) hope was that there could be a simple few-lines fix for reducing the worst-case scanning attack down to seconds without the need of dynamic memory allocation, but such an easy fix doesn't seem to be in sight.
Hi @theStack, thanks a lot for taking the time to review the proposal and write up the comparison. My view is that we should aim for the most efficient and resilient solution, even if the attack scenario is somewhat unusual. The spec can be updated as needed to reflect improvements. Independently of adversarial cases, reducing scanning time from several minutes to under a second is a major win for user experience. Regarding complexity: although this approach adds some moving parts, the underlying idea (e.g., using binary search) is conceptually straightforward, so I don't think it should pose a significant burden by itself. As for the dynamic memory concern, the allocation doesn't necessarily have to happen inside the secp256k1 module. We could let the caller provide the required workspace. For example, could we leverage the scratch space API?
@w0xlt: After tinkering a bit with your proposed implementation w0xlt@2305d73, I found that it could be significantly simplified, and it turned out we don't even need dynamic memory allocation:
Doing this in theStack@1a40e14 still keeps the worst-case scanning benchmark time down at ~450ms on my machine. It should be pointed out that the "label set" approach doesn't come for free though: one major drawback is that for users with a larger label set, the typical scanning scenario (assuming most txs don't have more than just a few outputs) can be slower at some point, as the number of point additions is higher than with the "BIP" approach. So if we follow your approach, it's better for small-scale users and for eliminating the worst-case scanning attack, but users with a lot of labels are likely not happy with it and would very much prefer the "BIP" scanning approach. Not really sure how to deal with that and what to prioritize. (Provide two scanning implementations and let the user pick? 😬 ) I'll anyway try to come up with another "modified example" benchmark to better verify the claims about typical scanning scenarios, depending on the number of labels involved and transaction output count. We have spent a good amount of time on the worst-case scanning scenario now, but it would be wrong to solely optimize for that and lose sight of the typical scanning scenario.
I don't agree with that statement, as I can't think of a legit non-adversarial case that would lead to scanning times of several minutes, in particular as this would at the minimum involve creating a large non-standard transaction. When considering user experience, by far the most relevant case I'd consider is the "no match" scenario (one k loop iteration), as it is the most frequent one, followed by the "single match" (two k loop iterations) scenario. Everything above that should be already quite rare in practice.
Good question. Right now, the scratch space API is not even exposed to the user. Btw I don't think dynamic memory allocation is a strict no-go in general, it just seems that there are still some open questions around that area, and waiting until they are all tackled would probably delay the PR for an unforeseeable time. It seems that as of now we can easily avoid dynamic memory allocation anyways though, in both of the proposed implementations.
Some (very) rough ideas on the trade-off… If we can define a general rule that determines when it's better to use the label-set scan versus the BIP-based scan, we could do something like:
/* Choose algorithm based on this transaction's characteristics */
if (n_labels > 2 * n_tx_outputs) {
    /* BIP approach: iterate over outputs, add, and look up in label callback */
    use_bip_scan();
} else {
    /* Label-set approach: iterate over labels, add, and search in output index */
    use_labelset_scan();
}
The heuristic above is just a rough estimate. The BIP version performs roughly ~2N point additions, where N is the number of outputs, while the label-set version performs about ~L additions, where L is the number of configured labels. But to build a reliable decision rule, we'd need more benchmark data across a range of (N, L) combinations.
Here is a first attempt at creating a scanning benchmark for the common case scenario (i.e. no matches and hence only one k iteration), comparing the "BIP" and "Label-set" approaches over a combination of L (number of labels to scan for) and N (number of tx outputs): theStack@fef63fd. Running this modified example outputs the following on my arm64 machine: … These results should be taken with a large grain of salt, as the numbers fluctuate a lot (visible for the BIP approach, where all lines should have about the same results, as the performance is in theory independent of L; also, for the first two L the results are surprisingly worse, maybe some unintended caching effects for later runs?), but one can at least see the general previously suspected trend that the BIP approach gets worse with increasing N, while the label-set approach gets worse with increasing L. As a very rough first statement, I'd say that assuming users don't use more than a handful (let's say up to 10) labels and transactions with more outputs get more common in the future, using the label-set approach seems superior, and the only reason to keep the BIP approach is to target users that want to scan for a large number (let's say dozens) of labels. Comments or ideas on how to best reason about this would be much appreciated (also a rough review of the modified benchmark example; it could also be that there are bugs). Maybe someone has an idea how to connect this with some actual historic on-chain stats? The list of L and N values to benchmark for can be adapted by changing the two arrays …
Makes sense yeah, I thought about something similar. What makes this a bit inconvenient (aside from additional review and maintenance burden, obviously) is that users would have to both pass in the label set and implement and pass in a callback function, which seems to result in a bloated API. Maybe it's better to focus on one approach and ship another scan function for "giant label set" power users later? I could be wrong, but I don't think using more than three or four labels would be a very common use-case.
@theStack Thanks very much for this benchmark. It is very insightful. From my understanding, your benchmark confirms that the label-set approach performs better overall and scales more smoothly with the transaction size. I also evaluated Montgomery batch inversion via … Below is the benchmark commit as measured on my machine (no batch inversion): … The numbers after applying batch inversion: … The only code change here is using batch inversion in the label-set path.
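For readers unfamiliar with the trick: Montgomery batch inversion replaces n field inversions with one inversion plus roughly 3n multiplications. A toy, self-contained sketch over a small prime field (uint64_t arithmetic, not secp256k1's field types) to show the structure:
#include <stdint.h>
#include <stddef.h>

#define TOY_P 1000003ULL /* small prime, illustration only */

static uint64_t fmul(uint64_t a, uint64_t b) { return (a * b) % TOY_P; }

static uint64_t fpow(uint64_t a, uint64_t e) { /* a^e mod p */
    uint64_t r = 1;
    while (e) { if (e & 1) r = fmul(r, a); a = fmul(a, a); e >>= 1; }
    return r;
}

static uint64_t finv(uint64_t a) { return fpow(a, TOY_P - 2); } /* Fermat's little theorem */

/* Invert in[0..n-1] (all nonzero) into out[0..n-1] with a single field inversion. */
static void batch_inverse(uint64_t *out, const uint64_t *in, size_t n) {
    uint64_t acc = 1, acc_inv;
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = acc;            /* prefix product of in[0..i-1] */
        acc = fmul(acc, in[i]);
    }
    acc_inv = finv(acc);         /* the only real inversion */
    for (i = n; i-- > 0; ) {
        out[i] = fmul(out[i], acc_inv); /* equals 1 / in[i] */
        acc_inv = fmul(acc_inv, in[i]);
    }
}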
This is now available as an actual benchmark using the framework (i.e. implemented in the module's …).
I agree. It seems to me that, unless there are really strong reasons to support use-cases with dozens of labels, we should consider switching from the BIP to the label-set scanning approach. I'm not sure though if larger N values appear frequently enough in blocks nowadays to have a significant influence over the overall scanning time (I'd roughly guess that the vast majority of SP eligible txs wouldn't exceed 10 outputs, though that should be verified), but that could change in the future.
That's great, I didn't review this code in detail yet but it seems to be simple enough to be considered on top, if we go with the label-set scanning approach; the benchmarks also look very promising, I will integrate them and run on my machine as well tomorrow.
furszy
left a comment
A bit late but joining this party full-time now. I just finished the sending part. Will keep reviewing. Cool stuff.
seckey_sum_scalar = secp256k1_scalar_zero;
for (i = 0; i < n_plain_seckeys; i++) {
    ret = secp256k1_scalar_set_b32_seckey(&addend, plain_seckeys[i]);
In ffffd7f:
what if plain_seckeys[i] is NULL? (same for taproot_seckeys[i] below).
Good catch, will add an ARG_CHECK to check for non-NULL, here and in other API functions (scanning) where similar checks for "array of pointers" elements are missing. I've also opened #1779 to do the same for API functions in master for consistency.
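A sketch of what such a check could look like (inside the existing per-element loops; exact placement to be decided):
for (i = 0; i < n_plain_seckeys; i++) {
    ARG_CHECK(plain_seckeys[i] != NULL);
}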
/* BIP0352 specifies that k is serialized as a 4 byte (32 bit) value, so we check to make
 * sure we are not exceeding the max value for a uint32 before incrementing k.
 * In practice, this should never happen as it would be impossible to create a transaction
 * with this many outputs.
 */
if (k < UINT32_MAX) {
    k++;
} else {
    return 0;
}
In ffffd7f:
As k depends on n_recipients, wouldn't it be simpler to ensure that n_recipients < UINT32_MAX early in the function?
ARG_CHECK(n_recipients > 0 && n_recipients < UINT32_MAX);
Bounding the number of recipients seems reasonable to me (even though by protocol they could exceed the 32-bit range in theory, if they are not all going to the same scan pubkey, and we would at some point allow ~185 GB sized transactions :p). I wonder if we could even just set the type of n_recipients to uint32_t instead of size_t, so the limit is already enforced at compile-time? IIRC there were some previous discussions between @josibake and @jonasnick about that topic (in take 3 or 2 of this PR), will look those up next week before changing.
typedef struct secp256k1_silentpayments_recipient {
    secp256k1_pubkey scan_pubkey;
    secp256k1_pubkey spend_pubkey;
    size_t index;
q: Instead of having this weird index field in the public API, wouldn't it be simpler to include the generated output directly inside secp256k1_silentpayments_recipient?
E.g.
struct secp256k1_silentpayments_recipient {
/* Inputs */
secp256k1_pubkey in_scan_pubkey;
secp256k1_pubkey in_spend_pubkey;
/* Output */
secp256k1_xonly_pubkey out_generated_output;
}
I think this change would be slightly simpler for the implementation, but possibly quite confusing for the user. Note that the passed in _silentpayments_recipient array gets sorted in place (to group by scan pubkey), so we need some way to map the generated outputs back to the original recipients. Doing this by just keeping the original order (that's what the index field is needed for) seems the most obvious (expected?) way. Admittedly, with your proposed change a user would have the necessary information to do the mapping themselves, but it seems odd to e.g. pass in two recipients (r1, r2), get the result back in a different order (r2, r1) and then have to figure out where each recipient has moved in order to get its corresponding generated output.
Admittedly with your proposed change an user would have the necessary information to do the mapping themselves, but it seems odd to e.g. pass in two recipients (r1, r2), get the result back in a different order (r2, r1) and then having to find out again at what position each recipient has moved, in order to get the corresponding generated output each.
If you have all the information, does the initial ordering matter?
================================================================================
Also, I woke up creative.. another idea to remove the index field could be to store the initial ordering inside the generated_outputs array. We basically have an n_recipients-sized array with elements that are essentially buffers :).
So.. we could do something like:
diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
--- a/src/modules/silentpayments/main_impl.h (revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/src/modules/silentpayments/main_impl.h (date 1765038514517)
@@ -185,13 +185,14 @@
const unsigned char * const *plain_seckeys,
size_t n_plain_seckeys
) {
- size_t i, k;
+ size_t i, k, j;
secp256k1_scalar seckey_sum_scalar, addend, input_hash_scalar;
secp256k1_ge prevouts_pubkey_sum_ge;
secp256k1_gej prevouts_pubkey_sum_gej;
unsigned char shared_secret[33];
secp256k1_pubkey current_scan_pubkey;
int ret, sum_is_zero;
+ secp256k1_xonly_pubkey *output = NULL;
/* Sanity check inputs. */
VERIFY_CHECK(ctx != NULL);
@@ -211,9 +212,6 @@
} else {
ARG_CHECK(n_plain_seckeys == 0);
}
- for (i = 0; i < n_recipients; i++) {
- ARG_CHECK(recipients[i]->index == i);
- }
seckey_sum_scalar = secp256k1_scalar_zero;
for (i = 0; i < n_plain_seckeys; i++) {
@@ -268,6 +266,13 @@
return 0;
}
secp256k1_scalar_mul(&seckey_sum_scalar, &seckey_sum_scalar, &input_hash_scalar);
+
+ /* Store recipients ordering prior to sorting the array */
+ for (i = 0; i < n_recipients; i++) {
+ /* Use the output struct as a buffer for storing the pointer of the recipient */
+ memcpy(generated_outputs[i]->data, &recipients[i], sizeof(secp256k1_silentpayments_recipient*));
+ }
+
/* _recipient_sort sorts the array of recipients in place by their scan public keys (lexicographically).
* This ensures that all recipients with the same scan public key are grouped together, as specified in BIP0352.
*
@@ -294,7 +299,17 @@
secp256k1_silentpayments_create_shared_secret(ctx, shared_secret, &pk, &seckey_sum_scalar);
k = 0;
}
- if (!secp256k1_silentpayments_create_output_pubkey(ctx, generated_outputs[recipients[i]->index], shared_secret, &recipients[i]->spend_pubkey, k)) {
+
+ /* Look-up for the correct output to initialize based on the initial ordering */
+ for (j = 0; j < n_recipients; j++) {
+ secp256k1_silentpayments_recipient *ptr = NULL;
+ memcpy(&ptr, generated_outputs[j]->data, sizeof(secp256k1_silentpayments_recipient*));
+ if (ptr == recipients[i]) {
+ output = generated_outputs[j];
+ }
+ }
+
+ if (!secp256k1_silentpayments_create_output_pubkey(ctx, output, shared_secret, &recipients[i]->spend_pubkey, k)) {
secp256k1_scalar_clear(&seckey_sum_scalar);
secp256k1_memclear_explicit(&shared_secret, sizeof(shared_secret));
return 0;
===================================================================
diff --git a/include/secp256k1_silentpayments.h b/include/secp256k1_silentpayments.h
--- a/include/secp256k1_silentpayments.h (revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/include/secp256k1_silentpayments.h (date 1765038119367)
@@ -45,7 +45,6 @@
typedef struct secp256k1_silentpayments_recipient {
secp256k1_pubkey scan_pubkey;
secp256k1_pubkey spend_pubkey;
- size_t index;
} secp256k1_silentpayments_recipient;
/** Create Silent Payments outputs for recipient(s).
===================================================================
diff --git a/src/modules/silentpayments/tests_impl.h b/src/modules/silentpayments/tests_impl.h
--- a/src/modules/silentpayments/tests_impl.h (revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/src/modules/silentpayments/tests_impl.h (date 1765038518151)
@@ -98,7 +98,6 @@
for (i = 0; i < 3; i++) {
CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].scan_pubkey, (*sp_addresses[i])[0], 33));
CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].spend_pubkey,(*sp_addresses[i])[1], 33));
- recipients[i].index = i;
recipient_ptrs[i] = &recipients[i];
generated_output_ptrs[i] = &generated_outputs[i];
}
@@ -174,7 +173,6 @@
CHECK(secp256k1_ec_pubkey_parse(CTX, &r[i].scan_pubkey, (*sp_addresses[i])[0], 33));
CHECK(secp256k1_ec_pubkey_parse(CTX, &r[i].spend_pubkey,(*sp_addresses[i])[1], 33));
/* Set the index value incorrectly */
- r[i].index = 0;
rp[i] = &r[i];
op[i] = &o[i];
}
@@ -183,13 +181,6 @@
t[0] = &taproot;
p[0] = ALICE_SECKEY;
- /* Fails if the index is set incorrectly */
- CHECK_ILLEGAL(CTX, secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1));
-
- /* Set the index correctly for the next tests */
- for (i = 0; i < 2; i++) {
- r[i].index = i;
- }
CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1));
/* Check that null arguments are handled */
@@ -257,7 +248,6 @@
CHECK(secp256k1_ec_pubkey_negate(CTX, &neg_spend_pubkey));
r[0].spend_pubkey = neg_spend_pubkey;
for (i = 0; i < 2; i++) {
- r[i].index = i;
rp[i] = &r[i];
}
CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1) == 0);
That being said, this improves the public API at the cost of making the internal code a bit more complex. So I'm not strong on it — I just had some fun thinking about different ways to remove the ugly index field. So no need to take it.
If you have all the information, does the initial ordering matter?
In theory it doesn't, in practice I think some users would be at least surprised or annoyed by getting the generated outputs back in an order that is different from the passed in recipients (some might even lose money by confusing the amounts for e.g. the actual recipient and the change output).
Also, I woke up creative.. another idea to remove the index field could be to store the initial ordering inside the generated_outputs array. We basically have a n_recipients size array with elements that are essentially buffers :).
That's a neat and indeed very creative idea 😃. (Ab)using a pubkey object's underlying memory for temporarily storing a pointer value feels very hacky though, so not sure if we want to introduce these kind of tricks in general. On the other hand, simplifying the API would be indeed a win, so I'm not fully opposed either. Curious what others think about this.
if (!secp256k1_silentpayments_calculate_input_hash_scalar(&input_hash_scalar, outpoint_smallest36, &prevouts_pubkey_sum_ge)) {
    secp256k1_scalar_clear(&seckey_sum_scalar);
    return 0;
}
secp256k1_scalar_mul(&seckey_sum_scalar, &seckey_sum_scalar, &input_hash_scalar);
In ffffd7f:
This is not realistically possible but.. should also check that the input hash scalar is not the inverse of the aggr sk (aka the result is not 1).
I'm trying to wrap my head around what the exact implications would be if that indeed happened. If …
(Wouldn't any other small-ish value also be problematic, as guessing …)
if (!secp256k1_eckey_pubkey_tweak_add(&output_ge, &output_tweak_scalar)) {
    secp256k1_scalar_clear(&output_tweak_scalar);
    return 0;
};
In ffffd7f:
Extra semicolon here.
static void test_recipient_sort_helper(unsigned char (*sp_addresses[3])[2][33], unsigned char (*sp_outputs[3])[32]) {
    unsigned char const *seckey_ptrs[1];
    secp256k1_silentpayments_recipient recipients[3];
    const secp256k1_silentpayments_recipient *recipient_ptrs[3];
    secp256k1_xonly_pubkey generated_outputs[3];
    secp256k1_xonly_pubkey *generated_output_ptrs[3];
    unsigned char xonly_ser[32];
    size_t i;
    int ret;

    seckey_ptrs[0] = ALICE_SECKEY;
    for (i = 0; i < 3; i++) {
        CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].scan_pubkey, (*sp_addresses[i])[0], 33));
        CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].spend_pubkey,(*sp_addresses[i])[1], 33));
        recipients[i].index = i;
        recipient_ptrs[i] = &recipients[i];
        generated_output_ptrs[i] = &generated_outputs[i];
    }
    ret = secp256k1_silentpayments_sender_create_outputs(CTX,
        generated_output_ptrs,
        recipient_ptrs, 3,
        SMALLEST_OUTPOINT,
        NULL, 0,
        seckey_ptrs, 1
    );
    CHECK(ret == 1);
    for (i = 0; i < 3; i++) {
        secp256k1_xonly_pubkey_serialize(CTX, xonly_ser, &generated_outputs[i]);
        CHECK(secp256k1_memcmp_var(xonly_ser, (*sp_outputs[i]), 32) == 0);
    }
}

static void test_recipient_sort(void) {
    unsigned char (*sp_addresses[3])[2][33];
    unsigned char (*sp_outputs[3])[32];

    /* With a fixed set of addresses and a fixed set of inputs,
     * test that we always get the same outputs, regardless of the ordering
     * of the recipients
     */
    sp_addresses[0] = &CAROL_ADDRESS;
    sp_addresses[1] = &BOB_ADDRESS;
    sp_addresses[2] = &CAROL_ADDRESS;

    sp_outputs[0] = &CAROL_OUTPUT_ONE;
    sp_outputs[1] = &BOB_OUTPUT;
    sp_outputs[2] = &CAROL_OUTPUT_TWO;
    test_recipient_sort_helper(sp_addresses, sp_outputs);

    sp_addresses[0] = &CAROL_ADDRESS;
    sp_addresses[1] = &CAROL_ADDRESS;
    sp_addresses[2] = &BOB_ADDRESS;

    sp_outputs[0] = &CAROL_OUTPUT_ONE;
    sp_outputs[1] = &CAROL_OUTPUT_TWO;
    sp_outputs[2] = &BOB_OUTPUT;
    test_recipient_sort_helper(sp_addresses, sp_outputs);

    sp_addresses[0] = &BOB_ADDRESS;
    sp_addresses[1] = &CAROL_ADDRESS;
    sp_addresses[2] = &CAROL_ADDRESS;

    /* Note: in this case, the second output for Carol comes before the first.
     * This is because heapsort is an unstable sorting algorithm, i.e., the ordering
     * of identical elements is not guaranteed to be preserved
     */
    sp_outputs[0] = &BOB_OUTPUT;
    sp_outputs[1] = &CAROL_OUTPUT_TWO;
    sp_outputs[2] = &CAROL_OUTPUT_ONE;
    test_recipient_sort_helper(sp_addresses, sp_outputs);
}
In ffffd7f:
The heapsort comment was a bit misleading to me. It only matters because we’re hardcoding the expected outputs, but it's not something we actually care about.
The goal of the test is to make sure that shuffling the inputted recipients never changes which outputs are produced, and that each expected output appears exactly once.
So we don’t need to rely on any final ordering assumptions at all; they depend on heapsort’s stability, which (per comment) is unstable. Take the following if you like it:
static void shuffle(unsigned char (*sp_addresses[])[2][33], size_t size) {
size_t i, j;
for (i = size - 1; i > 0; i--) {
unsigned char (*tmp)[2][33] = sp_addresses[i];
j = testrand_bits(8) % (i + 1);
sp_addresses[i] = sp_addresses[j];
sp_addresses[j] = tmp;
}
}
/* Ensure that shuffling the inputted recipients never changes
* which outputs are produced, and that each expected output
* appears exactly once */
static void test_recipient_sort(void) {
int i, j;
unsigned char (*sp_addresses[3])[2][33] = { &CAROL_ADDRESS, &CAROL_ADDRESS, &BOB_ADDRESS };
unsigned char const *seckey_ptrs[1] = { ALICE_SECKEY };
unsigned char xonly_ser[32];
secp256k1_silentpayments_recipient recipients[3];
const secp256k1_silentpayments_recipient *recipient_ptrs[3];
secp256k1_xonly_pubkey generated_outputs[3];
secp256k1_xonly_pubkey *generated_output_ptrs[3];
for (i = 0; i < 6; i++) {
int pos_bob = -1, pos_carol_one = -1, pos_carol_two = -1;
/* Randomize array */
shuffle(sp_addresses, 3);
for (j = 0; j < 3; j++) {
CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[j].scan_pubkey, (*sp_addresses[j])[0], 33));
CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[j].spend_pubkey,(*sp_addresses[j])[1], 33));
recipients[j].index = j;
recipient_ptrs[j] = &recipients[j];
generated_output_ptrs[j] = &generated_outputs[j];
}
CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, generated_output_ptrs,
recipient_ptrs, 3,
SMALLEST_OUTPOINT,
NULL, 0,
seckey_ptrs, 1) == 1);
/* Verify output correctness and they all appear once */
for (j = 0; j < 3; j++) {
int* pos = NULL;
CHECK(secp256k1_xonly_pubkey_serialize(CTX, xonly_ser, &generated_outputs[j]));
if (secp256k1_memcmp_var(xonly_ser, BOB_OUTPUT, 32) == 0) {
pos = &pos_bob;
} else if (secp256k1_memcmp_var(xonly_ser, CAROL_OUTPUT_ONE, 32) == 0) {
pos = &pos_carol_one;
} else if (secp256k1_memcmp_var(xonly_ser, CAROL_OUTPUT_TWO, 32) == 0) {
pos = &pos_carol_two;
} else {
TEST_FAILURE("Error: unknown generated output");
}
CHECK(*pos == -1); /* enforce uniqueness */
*pos = j;
}
}
Thanks for taking a closer look there. I agree this could be improved.
The goal of the test is to make sure that shuffling the inputted recipients never changes which outputs are produced, and that each expected output appears exactly once.
So we don’t need to rely on any final ordering assumptions at all; they depend on heapsort’s stability, which (per comment) is unstable.
Note that order of the generated outputs does indeed matter and should be tested. The sending API as-is guarantees that for a passed in recipient at position i (recipients[i]), the generated output at the same position (generated_outputs[i]) can be found (and hence be spent later) by that exact recipient. The only thing not guaranteed due to unstable heap sort is the order in which the increasing k values are used for generating those outputs, for recipients that share the same scan public key.
For example, let's say we have three recipients, leading to three outputs:
r0 = (scan_pk, spend_pk_A) -> tx_out0
r1 = (scan_pk, spend_pk_B) -> tx_out1
r2 = (scan_pk, spend_pk_C) -> tx_out2
These three recipients are all in one group, sharing the same ecdh_shared_secret (see BIP-352). The output tweak … index field in the recipient data structure (see also previous related comment #1765 (comment)).
So in the quoted test code,
CAROL_OUTPUT_ONE means "output to carol, generated with k=0" and
CAROL_OUTPUT_TWO means "output to carol, generated with k=1" accordingly.
If sp_addresses[i] is assigned to BOB_ADDRESS, sp_outputs[i] must be BOB_OUTPUT (this case is unambiguous, Bob's scan pubkey appears only once in the recipient list).
If sp_addresses[i] is assigned to CAROL_ADDRESS, sp_outputs[i] must be either CAROL_OUTPUT_ONE or CAROL_OUTPUT_TWO (that scan pubkey appears twice in the recipient list, so the order within those is ambiguous).
I will give this some more thoughts on how to better express this in the test, suggestions of course welcome.
I did this with my brain based on the other comment, the one where I say "If you have all the information, does the initial ordering matter?" (#1765 (comment)). I just read your response there. Thanks.
We can easily adapt the generalized test to verify that the initial recipients’ ordering matches the final outputs order. We just need to cache their initial positions during creation (post-shuffle), then check that Bob stays in the same spot and that the two Carols appear exactly once, working around the heapsort instability. That is simple to do.
Overall, this latest test covers strictly more cases than the previous one. And with this approach, we can also easily confirm we’re not biased by the fixed number of outputs we currently have (right now we only verify arrays of length 3). E.g. it would be nice to test against arrays of length two as well. Also, something very important, it would be good to test k is increased for a second output from Bob when there are two outputs for Carol (If k is only ever increased and not reset or something unexpected happens, we do have a major issue).
I'm not expecting this ^^ to fail, it is just test coverage we should have.
SECP256K1_API SECP256K1_WARN_UNUSED_RESULT int secp256k1_silentpayments_recipient_create_label(
    const secp256k1_context *ctx,
    secp256k1_pubkey *label,
I'm unsure about the label use-case so far but leaving that topic aside (will keep thinking about it),
I'm wondering if we could use a typedef for the label here instead of relying on the public key structure.
At least for me, public keys are meant for sending coins to, and we don’t want anyone sending coins to the group element label = hash(scan_key || m) * G. Clarifying the intent with a simple typedef might save us a few headaches.
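A minimal sketch of what such a typedef could look like (the name is hypothetical, just to make the suggestion concrete):
/* Distinct name for label points, so call sites don't read like "send coins to this pubkey". */
typedef secp256k1_pubkey secp256k1_silentpayments_label;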
Description
This PR implements BIP352 with scanning limited to full nodes. Light-client scanning is planned to be added in a separate PR in the future. The following 5 API functions are currently introduced:
Sender side [BIP description]:
secp256k1_silentpayments_sender_create_outputs: given a list of the sender's input private keys, the smallest outpoint, and a list of recipients, create the corresponding transaction outputs (x-only public keys)
Receiver side, label creation [BIP description]:
secp256k1_recipient_create_label: given a scan secret key and label integer, calculate the corresponding label_tweak and label public key
secp256k1_recipient_create_labeled_spend_pubkey: given a spend public key and a label public key, create the corresponding labeled spend public key
Receiver side, scanning [BIP description]:
secp256k1_recipient_prevouts_summary_create: given a list of the transaction input public keys and the smallest outpoint, create the prevouts_summary object needed for scanning
secp256k1_recipient_scan_outputs: given a prevouts_summary object, a recipient's scan secret key and spend public key, and the relevant transaction outputs (x-only public keys), scan for outputs belonging to the recipient and return the tweak(s) needed for spending the output(s). Optionally, a label_lookup callback function can be provided to also scan for labels.
For a higher-level overview of what these functions exactly do, it's suggested to look at a corresponding Python implementation that was created based on the secp256k1lab project (it passes the test vectors, so this "executable pseudo-code" should be correct).
Changes to the previous take
Based on the latest state of the previous PR #1698 (take 3), the following changes have been made:
… (_prevout_summary_{parse,serialize}, _recipient_create_output_pubkeys); adapted tests and benchmark accordingly
The scope reduction isn't immediately visible in commit count (only one commit was solely introducing light-client relevant functionality and could be completely removed), but the review burden compared to #1698 is still significantly lower in terms of LOC, especially in the receiving commit.
Open questions / TODOs
Recent proposals of reducing the worst-case scanning time (see posts by w0xlt and jonasnick, #1698 (comment) ff.) are not taken into account yet. ➡️ solved by marking already-found outputs, see #1765 (comment) ✔️
Not providing prevouts_summary (de)serialization functionality yet in the API poses the risk that users try to do it anyway by treating the opaque object as "serialized". How to cope with that? Is adding a "don't do this" comment in the API header sufficient? ➡️ solved by adding a "don't do this" comment in the API header (same phrasing as in other modules), see #1765 (comment) ✔️